THE MIRROR TRACING TEST AS A DIAGNOSTIC AID 
FOR EMOTIONAL INSTABILITY¹ 


C. M. Louttit, Lt. Comdr., U.S.N.R. 
Bureau of Naval Personnel, Washington, D. C. 


Tracing a star in mirror vision has, according to Whipple (1915), 
been in use in psychological laboratories for studies on learning and 
muscular movement for half a century. The use of this technique in 
studying the efficiency of motor performance as an indicator of 
personal stability was apparently first reported by Weidensall (1916). 
She found the average times in seconds taken on first trials (size of 
star not stated) to be: Bedford Reformatory inmates, 473.1; college 
maids, 133.6; and college students (women), 82.6. Error scores, i.e., 
quality of performance, for the same groups were 204.6, 58.1, and 
46.8, respectively. While the data are not presented as completely 
as one would like, the author’s confidence in the test is expressed thus: 
“this test isolates better than any we have tried at Bedford those who 
are incapable of sustained effort.” 

Holsopple (1932) divided male reformatory inmates by the quality 
of their star tracings into the poorest and best groups of 40 from a 
total population of 200 and compared certain items of their his- 
tories. Men in the poor performance group had had 105 arrests 
before conviction, while those in the good group had only 74; the 
poorest group had 52 reports for infractions of institution rules, 
while the best had only 37. Without presenting detailed evidence 
Holsopple claims definite value in this test for indicating instability 
and says, “those recidivists who seem to be rather the victims of an 
overwhelmingly unfavorable environment than to have deep-seated 
personal disabilities show a minimum of difficulties in their mirror 
drawing.” Bois (1937), using a score combining time and errors, 
found with 67 subjects that the distribution gave a J-shaped curve, 
with 62 subjects having scores between 50 and 500 points and 5 subjects 
scoring between 600 and 950 points. Each of these five subjects 
was found to be “particularly feeble in emotional control.” 

1 Opinions expressed herein are those of the writer only and are not to 
be construed as official or reflecting the views of the Navy Department or 
the Naval service at large. 

The essential agreement among these three studies, and the lack 
of contrary evidence, would seem to justify further experimental 
exploration of the method. 


METHOD 


Apparatus. While the essentials of the mirror tracing apparatus 
are simple, there has not been uniformity in the details among different 
investigators. Therefore, in the interests of standardization, the 
apparatus used in this research is described. A mirror is mounted across 
one end of a baseboard of 3-ply wood, 16 x 12 1/5 inches. About 7.5 inches 
in front of the mirror is a frame of 1 x 0.75 inch wood, to which 
angular blocks are fastened on either side to support a board screen. The 
frame is open at the base so that the record sheets can be fastened to 
the baseboard. These dimensions should be considered approximate, 
although they have been determined by experiment to be satisfactory. 

The record blanks have a mimeographed six-pointed star with 
necessary blanks for identification and record. The star is inscribed 
in a circle with a 2.75 inch radius. It is placed on the sheet with one 
of the longest axes at an 18° angle from the vertical. The sheet is 
fastened to the baseboard with its lower edge even with the 
edge of the baseboard toward the subject. The starting point, which is 
toward the subject, is marked with a short cross line and an arrow 
indicating the direction of initial movement. 

Directions. With the sheet properly fastened the subject was 
allowed to place his pencil on the cross mark in direct vision. He 
was then told to look only in the mirror and the following directions 
were given: “Trace the outline of the star going in the direction of 
the arrow. Work as rapidly as you can but try to keep on the line. 
Keep your pencil on the paper all the time.” 

Scoring: Time. The time of performance was taken with a stop 
watch, from the first movement of the pencil until return to the cross 
line. Time was recorded only to the nearest quarter minute. The 
time for subjects who refused to complete the performance was taken 
at 20 minutes. In this experiment no time limit was imposed, but 
the data indicated that a limit of 10 or even 5 minutes may be suit- 
able in practice. 

Scoring: Quality. There is wide variation in the quality of 
performance, with a continuous gradation. Several methods 
of measuring the quality of the performance were tried, none of 
which yielded any difference between the experimental and 
control groups. The simplest method of scoring quality was by rating 
the records on a five-point scale, with the five classes roughly defined 
as follows (no partial credits given): 

1. Good to excellent. All lines coincident with legs; not more 
than two small stops; not more than 3 lines off legs and these parallel 
to legs at not more than 1 millimeter distance; no cross lines; no 
corners cut. 

2. Fair to good. Lines off not more than 3 millimeters but parallel 
to legs; more than two stops; lines not even or non-continuous; some, 
but not excessive, saw-tooth or looping lines. 

3. Fair. Many cross lines; lines way off legs but fairly even and 
parallel; many stops; irregular loops. 

4. Fair to poor. Majority of lines way off legs; excessive saw- 
tooth lines; excessive looping and stopping. In general very poor 
performance but evident attempt to follow the legs. 

5. Poor and failure. Scribbling; failure to follow legs; short cuts 
so as to avoid at least two sides of one point; several lines drawn 
inside or outside of star perpendicular to legs; failure to complete 
within time limits; refusal to complete. 

Combined scores. For a single score of performance the time scores 
in minutes have been weighted by multiplying by the quality scores. 
However, it will be evident that the time score alone is satisfactory. 
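
The weighting just described can be sketched in a few lines. This is an illustration only: the function name `weighted_score` is hypothetical, and the handling of a refusal (time taken at 20 minutes, quality in class 5) simply follows the scoring rules stated above.

```python
def weighted_score(time_minutes: float, quality_class: int) -> float:
    """Combined score: time in minutes multiplied by the quality class (1-5).

    Hypothetical helper for illustration; the paper states only the rule,
    not a procedure.
    """
    if quality_class not in (1, 2, 3, 4, 5):
        raise ValueError("quality class must be an integer from 1 to 5")
    if time_minutes <= 0:
        raise ValueError("time must be positive")
    return time_minutes * quality_class


# A subject who took 2.25 minutes and produced a class 3 tracing.
print(weighted_score(2.25, 3))

# A refusal: time is taken at 20 minutes and quality falls in class 5.
print(weighted_score(20, 5))
```

Because the quality class is a whole number from 1 to 5, the weighted score can never be smaller than the time score itself.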

Subjects. Data were secured from 86 problem men, the experi- 
mental group; and 82 normal men as a control group, making a total 
of 168. The experimental group included 58 patients at the psychiatric 
ward of the Naval Hospital, Washington, D. C. The diagnoses of 
these men were chiefly constitutional psychopathic state, psycho- 
neuroses, and incipient dementia praecox, with several having no 
diagnoses but admitted for observation. The remaining 28 in this 
group were prisoners at the Indiana State Prison Farm, a minimal 
security institution with a high proportion of vagrants, drunkards, 
and similar types of problems.¹ As a group these men represented a 
cross section of the mild and borderline behavior problems indicative 
of unstable personality and adjustmental inadequacies. The control 
group was composed of 51 enlisted men and petty officers of the 
Hospital Corps on duty or under instruction at the Naval Medical 
School, 21 convalescent patients from the medical or surgical wards 
of the Naval Hospital, and a miscellaneous group of 10 including 
medical officers and civilians. 

1 These data were secured through the courtesy of Mr. L. D. Cohen, 
Supervisor of Classification. 


RESULTS 


Time scores. The essential raw data for the times on first trial 
are given in table 1. These data are given in extenso in order to show 
the essential similarity of the distributions for the sub-groups of experi- 
mental and control subjects. The average time taken by the ex- 
perimental subjects was 5.38 minutes and for the controls only 2.68 
minutes. The difference between these means has a C.R. of 5.0, 
which is slightly larger but of the same order as C.R.’s calculated on 
fewer cases at earlier stages of the investigation. While the size of 
this critical ratio indicates practical certainty that similar comparisons 
made on new groups of subjects would show a difference similar to 


TABLE 1 

DISTRIBUTION OF TIME SCORES 

                  Experimental                           Control
Time        N.P.                            Corps
(min.)  patients  Prisoners  Total      %     men  Patients  Misc.  Total      %
1              9                 9   10.5      12         7            19   23.2
2             13          1     14   16.3      10         8      4     22   26.8
3             10          6     16   18.6      16         2      2     20   24.4
4              6          5     11   12.8       8         2      2     12   14.6
5              3          4      7    8.1       3         1      1      5    6.1
6              7          5     12   13.9       2         1      1      4    4.9
7              3          1      4    4.7
8              1          1      2    2.3
9              1          1      2    2.3
10             0          0      0
11-20          3          1      4    4.7
DNC            2          3      5    5.8
Totals        58         28     86             51        21     10     82

Mean (minutes): experimental 5.38; control 2.68 
Standard deviation: experimental 4.86; control 1.39 
Median: 3.0 

this one, it is not of great value in determining the clinical 
usefulness of the tool. For this purpose data on the proportions of the 
experimental groups which exceed certain score values of the control 
distribution are more significant. There were 81.3 percent of the 
experimental subjects who exceeded the median time of the control 
group; 33.8 percent of the experimental group exceeded 5 minutes in 
the time required, while only 4.9 percent of the control group did; 
finally, 19.8 percent of the experimental group exceeded 6 minutes, 
while none of the control group required as much time. 
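
As a rough check on the critical ratio, the means, standard deviations, and group sizes reported for the time scores can be put through the conventional large-sample formula: the difference between the means divided by the standard error of that difference. The paper does not state which formula was used, so both the formula and the function name `critical_ratio` are assumptions of this sketch.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by its standard error.

    Assumes the usual formula for independent groups,
    SE = sqrt(sd_a**2 / n_a + sd_b**2 / n_b); the paper does not say
    which formula the author actually applied.
    """
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


# Experimental group: mean 5.38, S.D. 4.86, N = 86.
# Control group:      mean 2.68, S.D. 1.39, N = 82.
cr = critical_ratio(5.38, 4.86, 86, 2.68, 1.39, 82)
print(round(cr, 2))
```

Under these assumptions the ratio comes out a little under 5, in line with the reported value.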

Qualitative performance. The five classes of qualitative per- 
formance were empirically established by sorting all 168 records with- 
out distinction as to the group in which they fell. The lack of any 
statistically significant difference in the quality of performance be- 
tween the experimental and control group is shown in table 2. 


TABLE 2 

DISTRIBUTION OF QUALITATIVE PERFORMANCE SCORES 

            Control Group       Experimental Group
Score     Number   Percent      Number   Percent
1             6       7.3          11      12.8
2            19      23.2          12      13.9
3            36      43.9          31      36.1
4            16      19.5          19      22.1
5             5       6.1          13      15.1


It is evident that the control subjects cluster about the mean some- 
what more than do the experimental subjects. The differences in the 
percentages falling into the various quality classes for the two groups 
are not great: the difference in class 5 is greatest but this one is 
only 1.9 times its standard error. This finding of a lack of difference 
in quality of performance is further supported by an analysis of 
part of these data with a much more detailed scoring method. This 
was done with 68 control and 20 experimental subjects. The time and 
the qualitative performances showed no real differences. 
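
The comparison for class 5 can be reproduced approximately from the counts in Table 2 (5 of 82 control records, 13 of 86 experimental records). The sketch assumes the conventional standard error of a difference between two independent proportions; the paper does not give its formula, and the function name is hypothetical.

```python
import math


def proportion_difference_ratio(k_a, n_a, k_b, n_b):
    """Ratio of a difference between two proportions to its standard error.

    Assumes the unpooled form SE = sqrt(p_a*q_a/n_a + p_b*q_b/n_b),
    which may differ from the formula the author used.
    """
    p_a, p_b = k_a / n_a, k_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return (p_a - p_b) / se


# Class 5: 13 of 86 experimental records against 5 of 82 control records.
ratio = proportion_difference_ratio(13, 86, 5, 82)
print(round(ratio, 1))
```

With these counts the ratio rounds to 1.9, the figure quoted above.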

Combined scores. In spite of the lack of significant difference 
between the quality score distributions for the two groups it is of some 
interest to analyze the combined scores. A coefficient of contingency 
of .078 between time and quality (using five steps for each measure) 
indicates a lack of relationship between these measures. This may be 
taken to mean that time and accuracy are measures of quite different 
performances, and, therefore, time weighted by accuracy might give 
significant scores. The essential data, without the complete distribu- 
tion, for the combined scores are given in table 3. 


TABLE 3 

STATISTICS FOR COMBINED SCORES 

                                   Control   Experimental
Standard deviation                   5.45       23.35
Percent exceed control median       50          67.9
Percent exceed score of 20           2.4        23.3
Percent exceed score of 30                      15.0

The critical ratio of the difference between the means is 4.9, 
indicating practical statistical certainty. The percentages of experimental 
subjects exceeding certain critical points are all slightly lower than 
the figures given for analogous points in the time scores. 


DISCUSSION 


The task presented by the star-tracing situation is that of adjust- 
ing to a reversal of the well established hand-eye coordination. It is 
evident from our own data and the other studies referred to that some 
subjects can do this readily while others require a long time, are very 
inaccurate, or even refuse to complete the task. There is no indication 
in these data that mental ability level is a significant factor in the 
performance. While not overtly evident in the records, observation 
of the subjects performing suggests that certain persons are calm and 
straightforward in their attack. Others are evidently emotionally 
disturbed: they are tense, make excessive movements, often 
concentrate at one point, may verbalize, and may even refuse to continue. As 
a working hypothesis we may say that subjects who show 
disturbance on the simple task presented by mirror-tracing will probably 
show disturbance when required to make other unfamiliar 
adjustments under pressure in non-laboratory situations. It was impossible 
to make detailed individual studies on all of our subjects; therefore 
we cannot attempt to test this hypothesis by individual comparisons 
of personality traits with performance. However, at least in a 
statistical sense, the fact that a group of subjects selected because they had 
not been able to adjust in varying degrees had poorer performances 
than a group of adequately adjusted men would seem to support the 
hypothesis. Furthermore, the following average times for several 
diagnostic groups of the psychiatric ward patients are suggestive: 


constitutional psychopathic states    11 cases    4.7 minutes 
psychoneuroses 
dementia praecox 
mental observation 


The small number of cases involved in these sub-groups makes the 
averages statistically meaningless, but it is of interest to note that 
these men who were in the hospital for observation only had lower 
time scores than those for whom a diagnosis had been made. 

The data do not allow us to make definite statements concerning 
the meaning of scores in individual cases, with one exception. Since 
20 percent of the experimental subjects took more than 6 minutes 
whereas none of the control subjects did, it seems reasonable to con- 
sider any subject requiring more than 6 minutes sufficiently unstable 
to warrant careful investigation. If to this critical time score we add 
a critical weighted score of 30 (above which 15 per cent of the experi- 
mental cases fell) we find that there are 23.2 percent of the ex- 
perimental subjects included and none of the controls. Thus, it seems 
that high scores, time and weighted, on star tracing may be taken as 
indicating instability, but scores below the critical points cannot be 
taken to mean adequate stability. 


SUMMARY AND CONCLUSION 


1. 168 men traced a six-pointed star in mirror vision; 86 of these 
were behavior problems, and 82 were non-problem. 

2. The average time score alone, and the average time weighted 
by quality scores, were significantly greater for the problem than for 
the non-problem group. 

3. The data showed critical points at 6 minutes and 30 points on 
the weighted scores. Above these points there were none in the 
control group, while there were 19.8 and 15 percent respectively in 
the experimental group. Selection using both criteria separated 23.2 
percent of the experimental group. 
4. It is suggested that subjects having scores higher than these 
critical points should be subject to careful examination for personal 
stability. 